Abstract. We consider the problem of determining $m_n$, the number of matroids on $n$ elements.
The best known lower bound on $m_n$ is due to Knuth (1974), who showed that $\log\log m_n$ is at least
$n - \frac{3}{2}\log n - O(1)$. On the other hand, Piff (1973) showed that $\log\log m_n \le n - \log n + \log\log n + O(1)$,
and it has been conjectured since that the right answer is perhaps closer to Knuth's bound.

We show that this is indeed the case, and prove an upper bound on $\log\log m_n$ that is within
an additive $1 + o(1)$ term of Knuth's lower bound. Our proof is based on using some structural
properties of non-bases in a matroid together with some properties of independent sets in the
Johnson graph to give a compressed representation of matroids.



1. Introduction 

Matroids, introduced by Whitney in his seminal paper [26], are fundamental combinatorial
objects and have been extensively studied due to their very close connection to combinatorial
optimization, see e.g. [23], and their ability to abstract core notions from areas such as graph theory
and linear algebra [21].

There are several ways to define a matroid. Perhaps the most natural one is using the notion
of independence. A matroid $M$ is a pair $(E, \mathcal{I})$, where $E$ is the ground set of elements, and $\mathcal{I}$ is a
nonempty collection of subsets of $E$ called the independent sets with the following properties:

(1) Subset property: $A \in \mathcal{I}$ implies $A' \in \mathcal{I}$ for all $A' \subseteq A$, and

(2) Exchange property: If $A, B \in \mathcal{I}$ with $|A| > |B|$, then there exists an element $x$ in $A \setminus B$
such that $B \cup \{x\} \in \mathcal{I}$.
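The two properties can be checked mechanically on small examples. The following is a minimal brute-force sketch, not part of the original argument; the function names `is_matroid` and `uniform` are hypothetical, and independent sets are assumed to be given as a Python set of frozensets.

```python
from itertools import combinations

def is_matroid(independent):
    """Check the subset and exchange properties for a family of frozensets."""
    fam = set(independent)
    if not fam:
        return False
    for a in fam:
        # Subset property: every proper subset of an independent set is independent.
        for k in range(len(a)):
            for sub in combinations(a, k):
                if frozenset(sub) not in fam:
                    return False
    for a in fam:
        for b in fam:
            # Exchange property: if |A| > |B|, some x in A \ B extends B.
            if len(a) > len(b) and not any(b | {x} in fam for x in a - b):
                return False
    return True

def uniform(n, r):
    """Independent sets of the uniform matroid: all subsets of size at most r."""
    return {frozenset(c) for k in range(r + 1) for c in combinations(range(n), k)}
```

For instance, `is_matroid(uniform(4, 2))` holds, while the family consisting of the empty set, {0}, {1}, {2} and {0,1} fails the exchange property (take A = {0,1} and B = {2}).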

A basic question is: how many distinct matroids can there be on a ground set of $n$ elements?
We denote this number by $m_n$. Clearly, there are $2^n$ subsets of $E$ and hence at most $2^{2^n}$ ways to
choose $\mathcal{I}$, which gives the trivial upper bound $\log\log m_n \le n$. Here, and throughout the paper, log
denotes the logarithm to the base 2.

This bound is easily improved to $\log\log m_n \le n - \frac{1}{2}\log n + O(1)$ by focussing on matroids of
a fixed rank. In a matroid, the maximal independent sets are called bases, and by the exchange
property all bases of a matroid have the same cardinality. This common cardinality is the rank
of the matroid. Let $m_{n,r}$ be the number of matroids of rank $r$. As $m_n = m_{n,0} + \dots + m_{n,n}$, it
must hold that $m_{n,r} \ge m_n/(n+1)$ for some $r$. By the subset property, any matroid of rank $r$ is
completely determined by specifying its bases. As there are at most $\binom{n}{r} \le \binom{n}{\lfloor n/2 \rfloor} = O(2^n/\sqrt{n})$
(call this $\ell$) such bases, this gives $m_{n,r} \le 2^{\ell}$ and thus

$\log\log m_n \le \log\log((n+1)m_{n,r}) \le \log\log((n+1)2^{\ell}) = n - \frac{1}{2}\log n + O(1).$

In 1973, Piff [22] improved this bound further to $\log\log m_n \le n - \log n + \log\log n + O(1)$, by
observing that a matroid is also completely determined by the closures of its circuits, and using a
counting argument to show that there are "only" $O(2^n/n)$ such closures (we describe Piff's proof in
section 2.5). This is the best upper bound known to date.
In the other direction, the best known lower bound is due to Knuth [14] from 1974, who showed
that $\log\log m_n \ge n - \frac{3}{2}\log n - O(1)$. Knuth's bound is based on an elegant construction of matroids
whose non-bases satisfy a particular property. Specifically, he constructs a large family of so-called
sparse paving matroids. These are matroids of rank $r$, where any two non-bases of size $r$ intersect
in at most $r - 2$ elements (i.e. their incidence vectors have Hamming distance 4 or more). Such
sets of non-bases are precisely the independent sets in the so-called Johnson graph $J(n,r)$. This is
the graph with vertex set $\binom{[n]}{r}$, in which two vertices are adjacent if and only if their intersection
contains $r - 1$ elements.

Knuth's bound follows by taking a collection of $k = \frac{1}{n}\binom{n}{\lfloor n/2 \rfloor}$ such non-bases, equivalently, an
independent set in the graph $J(n, n/2)$ of this size (section 2.4 has an explicit description of this
set), and considering the family of size $2^k$ of sparse paving matroids obtained by taking each possible
subset of this collection. Thus $m_n \ge s_n \ge 2^k$, where $s_n$ is the number of sparse paving matroids on $n$
elements. This gives the lower bound

(1) $\log\log m_n \ge \log\log s_n \ge \log k = n - \frac{3}{2}\log n + \frac{1}{2}\log\frac{2}{\pi} - o(1).$

We explain Knuth's bound in more detail in section 2.4.

Historically, the interest in paving matroids seems to be a response to the publication of the
catalog of matroids on at most 8 elements by Blackburn, Crapo, and Higgs [4] in the early 1970s. With
reference to such numerical evidence, Crapo and Rota consider it probable that paving matroids
"would actually predominate in any asymptotic enumeration of geometries" [8, p. 3.17]. In his book
"Matroid Theory", Welsh also notes that paving matroids predominate among the small matroids,
and poses the question whether this pattern extends to matroids in general as an exercise [25, p. 41].
An earlier lower bound on the number of matroids due to Piff and Welsh [23] was also based on a
bound on the number of (sparse) paving matroids. Mayhew and Royle recently confirmed that the
predominance of sparse paving matroids extends to the matroids on 9 elements [18].

In recent years, (sparse) paving matroids have received attention in relation to a wide variety
of matroid topics [12], [9], [20], [5]. These authors all suggest that the class of sparse paving matroids
is probably a very substantial subset of all matroids, pointing out Knuth's argument for the lower
bound.

Mayhew, Newman, Welsh and Whittle [16] present a very nice collection of conjectures on the
asymptotic behavior of matroids. In particular, they conjecture that asymptotically almost every
matroid is sparse paving:

Conjecture 1 (Mayhew, Newman, Welsh and Whittle [16]). $\lim_{n \to \infty} s_n/m_n = 1$.

If true, this would imply

Conjecture 2. $\log\log m_n = \log\log s_n + o(1)$.

Note that this is in fact a much weaker statement as $\log\log(\cdot)$ is a very "forgiving" function, e.g. if
$m_n = \Omega(n s_n)$ or even if $m_n = \Omega(2^{2^{\sqrt{n}}} s_n)$, then $s_n/m_n \to 0$, while still $\log\log m_n = \log\log s_n + o(1)$.

1.1. Our results. Our main result is a substantial strengthening of the upper bound on $m_n$.
Specifically, we show that

Theorem 3. The number of matroids $m_n$ on $n$ elements satisfies

$\log\log m_n \le n - \frac{3}{2}\log n + \frac{1}{2}\log\frac{2}{\pi} + 1 + o(1).$



For a matroid of rank $r$, a non-base is an $r$-subset of the ground set that is dependent.

Combining theorem 3 with Knuth's lower bound (1) on the number of sparse paving matroids
$s_n$, this gives

Corollary 4. $\log\log m_n \le \log\log s_n + 1 + o(1)$.

Thus, this result comes quite close to conjecture 2, except for the additive +1 term. In particular,
it implies that the number of matroids is indeed much closer to Knuth's lower bound, and perhaps
also lends support to the conjecture that most matroids are indeed sparse paving.

1.2. Our Techniques. The proof of theorem [3] is based on a combination of the following: 

(1) Techniques for proving refined upper bounds on the total number of independent sets in a 
graph. 

(2) Defining a notion of a local cover of a matroid, which serves as a short certificate to identify 
the bases in the neighborhood of an r-set. Combining the local covers for a carefully chosen 
set of r-sets then serves as a compressed representation of any matroid. 

To see the connection to the total number of independent sets, note that any upper bound on
$m_n$ is also an upper bound on $s_n$. As $s_n = s_{n,0} + s_{n,1} + \dots + s_{n,n}$, where $s_{n,r}$ denotes the number of
sparse paving matroids of rank $r$, and $s_{n,r}$ is precisely the total number of independent sets in the
Johnson graph $J(n,r)$, any method to upper bound $m_n$ must also bound the number of such sets.

We first give an overview of each of these two ideas, and then describe how these are combined
to prove theorem 3. These ideas are already useful by themselves to improve the currently known
bounds on $s_n$ and $m_n$. In section 3 we show how local covers can be used in a very simple way to
obtain the bound

Theorem 5. $\log\log m_n \le n - \frac{3}{2}\log n + 2\log\log n + O(1)$.

While this bound is weaker than the one in theorem 3, it already improves Piff's upper bound
substantially, and matches Knuth's lower bound up to the additive $O(\log\log n)$ term.

Similarly, in section 5 we show how the refined counting technique for independent sets implies

Theorem 6. $\log\log s_n \le n - \frac{3}{2}\log n + \frac{1}{2}\log\frac{2}{\pi} + 1 + o(1)$.

Previously, the best known upper bound on $s_n$ seems to be $\log\log s_n \le n - \frac{3}{2}\log n + O(\log\log n)$
[19] (we sketch an argument below).

Finally, we prove theorem 3 in section 6.

1.2.1. Upper-bounding $m_n$ via local covers: Let $m_{n,r}$ denote the number of matroids of rank $r$ on $n$
elements. As $m_n = m_{n,0} + \dots + m_{n,n}$, it suffices to bound each $m_{n,r}$ separately. For a matroid of
rank $r$, let us call a collection of flats a flat cover if it completely describes the matroid by certifying
for each $r$-set whether it is a basis or not.

A related notion is that of a local cover: a collection of flats that allows us to identify the bases
in the neighborhood of some fixed $r$-set. Our main observation is that given any matroid, for every
$r$-set, one can associate to it a local cover consisting of at most $r$ flats. This implies that if we pick
any dominating set $D$ in the Johnson graph and list all the local covers for the vertices in $D$, then
this gives a valid flat cover consisting of at most $|D| r$ flats. Together with standard arguments
about the existence of small dominating sets in any regular graph, this implies that each matroid
$M \in \mathbb{M}_{n,r}$ can be described by a "small" flat cover, which gives the bound in theorem 5.

1.2.2. Upper-bounding $s_n$ via independent sets: As $s_n = s_{n,0} + s_{n,1} + \dots + s_{n,n}$, it suffices to bound
each of these terms separately and we focus on the case of $r = n/2$, as this term has the largest
contribution to $s_n$. For a graph $G$, let $i(G)$ denote the number of independent sets in $G$, and recall
that $s_{n,r} = i(J(n,r))$. While it is hard to obtain any reasonable estimate of $i(G)$ for general graphs,
it was shown in [19] that

(2) $\log\log s_n \le n - \frac{3}{2}\log n + \log\log n + O(1).$

One may argue this as follows. Let $G = (V, E)$ be a $d$-regular graph and let $-\lambda$ denote the smallest
eigenvalue of its adjacency matrix. Then the size of a maximum independent set of $G$ is at most
$|V|\lambda/(d + \lambda)$ by Hoffman's bound (see e.g. theorem 3.5.2 of [7], or our corollary 18). Let us denote
$\alpha = \lambda/(d + \lambda)$. This implies that

(3) $i(G) \le \sum_{j=0}^{\alpha|V|} \binom{|V|}{j}.$



For the graph $J(n,n/2)$ it is known that $\alpha = (2 + o(1))/n$, which implies that the maximum
independent set has size at most $\alpha N$, where $N = \binom{n}{n/2}$. Note that this bound is quite good and is
within a factor $2 + o(1)$ of the size of the explicit independent set used in Knuth's lower bound.
Applying (3) to $J(n,n/2)$ then gives $s_{n,n/2} \le \exp((2 + o(1)) N \log n / n)$, which implies the bound (2).
We note that the proof of (2) in [19] is similar, except that there the same bound on the maximal
size of an independent set of $J(n,n/2)$ was shown by a combinatorial argument.

It turns out however that counting all the subsets in (3) is rather wasteful and that this bound
can be improved. In particular, we show that

Theorem 7. Let $G$ be a $d$-regular graph on $N$ vertices with smallest eigenvalue $-\lambda$. Then

$i(G) \le \sum_{i=0}^{\lceil \sigma N \rceil} \binom{N}{i} \, 2^{\alpha N},$

where $\alpha = \frac{\lambda}{d + \lambda}$ and $\sigma = \frac{\ln(d+1)}{d + \lambda}$.

For the graph $J(n,n/2)$, $\sigma \le \frac{8 \ln n}{n^2}$ and hence this gives the stronger bound $i(G) \le 2^{(2 + o(1))N/n}$.
As $\alpha N = (2 + o(1))N/n$ was our bound on the size of the maximum independent set, this bound on
$i(G)$ roughly implies that most independent sets occur as subsets of a few large independent sets
of size $\alpha N$. Using standard bounds on the binomial coefficients, this directly implies theorem 6.
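The size of the two parameters for $J(n,n/2)$ can be checked numerically. The sketch below is illustrative only: it assumes $n$ even, degree $d = n^2/4$, smallest eigenvalue $-n/2$, and the definitions $\alpha = \lambda/(d+\lambda)$ and $\sigma = \ln(d+1)/(d+\lambda)$ (the latter as suggested by the analysis in section 4); the function name `johnson_parameters` is hypothetical.

```python
import math

def johnson_parameters(n):
    """Return (alpha, sigma) for J(n, n/2), n even.

    Assumes d = (n/2)^2 and lambda = n/2, with
    alpha = lambda / (d + lambda) and sigma = ln(d + 1) / (d + lambda).
    """
    d, lam = (n // 2) ** 2, n // 2
    alpha = lam / (d + lam)
    sigma = math.log(d + 1) / (d + lam)
    return alpha, sigma

if __name__ == "__main__":
    for n in (10, 100, 1000):
        alpha, sigma = johnson_parameters(n)
        # n * alpha tends to 2, and sigma stays below 8 ln(n) / n^2.
        print(n, n * alpha, sigma, 8 * math.log(n) / n ** 2)
```

Under these assumptions $\alpha = 2/(n+2)$ exactly, so $\sigma N$ is smaller than $\alpha N$ by a factor of order $\ln n / n$; this is why the sum over sets $S$ contributes only a lower-order term.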

Our proof of theorem 7 is based on a procedure for encoding independent sets that is originally
due to Kleitman and Winston [13]. We remain very close to the description of the procedure
as given in Alon, Balogh, Morris, and Samotij [2]; see also that paper for detailed references on
the earlier uses of the procedure. Compared to [2], we have given a somewhat improved analysis
(specifically lemma 16) to obtain a sufficient bound in the parameter range that is of interest to us.

1.2.3. The improved upper bound on $m_n$: To obtain the bound in theorem 3, we combine the two
ideas above. The main observation is that given a matroid $M$, if $X$ is a dependent $r$-set (i.e. a
non-basis) in $M$, then $X$ has a local cover consisting of at most 2 flats (as opposed to up to $r$ flats if
$X$ was an arbitrary $r$-set). Thus if we could construct a flat cover using few such local covers, then
we would obtain a much smaller description of a matroid. To this end, we generalize the procedure
of Alon et al. [2] for encoding independent sets to more generally encode flat covers of the kind
described above using a small number of bits. This gives the improved bound on $m_{n,r}$ and hence on
$m_n$.

Finally, we remark that the +1 additive gap in our upper bound on $m_n$ arises only because of
the factor $2 + o(1)$ gap between the known upper and lower bounds on the size of the maximum
independent set in the graphs $J(n,r)$ for $r \approx n/2$. It is likely that reducing this gap could lead to
improved bounds for $m_n$. In section 7 we elaborate on this issue a bit further.




2. Preliminaries 



2.1. Matroids. As mentioned previously, a matroid $M$ is specified by $M = (E, \mathcal{I})$, where the
sets in the collection $\mathcal{I}$ satisfy the independence axioms. The elements of $\mathcal{I}$ are independent, the
remaining elements of $2^E \setminus \mathcal{I}$ are dependent. The set $E$ is the ground set, and we say that $M$ is a
matroid on $E$. There are various set systems and functions defined on $M$ that each allow one to
distinguish between dependent and independent sets, such as the set of bases, the rank function, the
circuits, the closure operator, etc. We define these notions and state some of their basic properties
here, but for a detailed account of their interrelations and for proofs we refer to Oxley [21].

A basis of $M$ is an inclusionwise maximal independent set of $M$. By the independence axioms,
each basis has the same cardinality. In this paper, we will present matroids as $M = (E, \mathcal{B})$, where
$\mathcal{B}$ is the set of bases of $M$. The following is an alternate characterization of matroids in terms of
the basis axioms, which we shall need later. A set $\mathcal{B} \subseteq 2^E$ is the set of bases of a matroid on $E$ if
and only if $\mathcal{B} \ne \emptyset$ and $\mathcal{B}$ satisfies the basis exchange axiom

(4) for each $B, B' \in \mathcal{B}$ and each $e \in B \setminus B'$ there exists an $f \in B' \setminus B$ such that $B - e + f \in \mathcal{B}$.

Here, we write $X + y := X \cup \{y\}$ and $X - y := X \setminus \{y\}$.

The rank of a set $X \subseteq E$ is $r_M(X) := \max\{|I| : I \subseteq X, I \in \mathcal{I}\}$, i.e. the cardinality of any
maximal independent set in $X$. The rank function is submodular:

$r_M(X \cap Y) + r_M(X \cup Y) \le r_M(X) + r_M(Y).$

We write $r(M) := r_M(E)$. Then $r(M)$ is the common cardinality of all bases, the rank of $M$. We
say that an $r$-set $X$ is a non-basis if $r_M(X) < r$. Clearly, a matroid of rank $r$ with set of bases $\mathcal{B}$
is also uniquely defined by its set of non-bases, $\binom{E}{r} \setminus \mathcal{B}$.

A circuit of $M$ is an inclusionwise minimal dependent set of $M$. We denote the set of circuits of
$M$ by $\mathcal{C}(M)$. By definition, each dependent set contains some circuit. We will use that if $X$ is an
$r$-set with $r_M(X) = r(M) - 1$, then it contains a unique circuit $C \subseteq X$.

In $M$, the closure of a set $X \subseteq E$ is the set $\mathrm{cl}_M(X) := \{e \in E \mid r_M(X + e) = r_M(X)\}$. We will
often use that $r_M(\mathrm{cl}_M(X)) = r_M(X)$ for any set $X$, which follows easily from induction and the
submodularity of the rank function. A set $F \subseteq E$ is called a flat of $M$ if $\mathrm{cl}_M(F) = F$, and $\mathcal{F}(M)$
denotes the set of all flats of $M$. As $\mathrm{cl}_M(\mathrm{cl}_M(X)) = \mathrm{cl}_M(X)$ for any set $X$, every closure $\mathrm{cl}_M(X)$
is a flat.

The following simple property of flats will be crucially used in our construction of flat covers: A
set $X \subseteq E$ is dependent if and only if there exists a flat $F$ such that $|X \cap F| > r_M(F)$. In other
words, $F$ acts as a witness that $X$ contains a dependency when restricted to $F$.

The dual of $M$ is the matroid $M^*$ whose bases are $\mathcal{B}^* = \{E \setminus B \mid B \in \mathcal{B}\}$. The bases, circuits,
rank, and closure of sets in $M^*$ are called the cobases, cocircuits, corank, and coclosure of sets in
$M$, and we write $r^*_M(X) := r_{M^*}(X)$, $\mathcal{C}^*(M) := \mathcal{C}(M^*)$, $\mathrm{cl}^*_M := \mathrm{cl}_{M^*}$, etc.

The rank and corank functions of $M$ are related by

(5) $r^*_M(X) = r_M(E \setminus X) - r(M) + |X|.$

We write

$\mathbb{M}_n := \{M \text{ a matroid} \mid E(M) = \{1, \dots, n\}\}$, $\mathbb{M}_{n,r} := \{M \in \mathbb{M}_n \mid r(M) = r\}$.

Also, we put $m_n := |\mathbb{M}_n|$, $m_{n,r} := |\mathbb{M}_{n,r}|$.

A matroid $M$ is paving if $|C| \ge r(M)$ for each circuit $C$ of $M$ (or equivalently if there is no
dependent set of size $< r(M)$), and sparse if $M^*$ is paving. $M$ is said to be sparse paving if it is
both sparse and paving. We write

$s_n := |\{M \in \mathbb{M}_n \mid M \text{ is sparse paving}\}|$, $s_{n,r} := |\{M \in \mathbb{M}_{n,r} \mid M \text{ is sparse paving}\}|$.
2.2. Bounds on binomial coefficients. We will frequently use the following standard bounds.

(6) $\binom{n}{r} \le \left(\frac{en}{r}\right)^r,$

(7) $\binom{n}{\lfloor n/2 \rfloor} = (1 + o(1)) \sqrt{\frac{2}{\pi n}} \, 2^n.$

We will also use the following bounds on the sum of binomial coefficients.

(8) $\sum_{i=0}^{k} \binom{n}{i} \le \frac{n - (k-1)}{n - (2k-1)} \binom{n}{k}.$

In particular if $k = o(n)$, then

(9) $\sum_{i=0}^{k} \binom{n}{i} \le (1 + o(1)) \binom{n}{k}.$

2.3. The Johnson graph. If $E$ is a finite set and $r \le |E|$, then we write

$\binom{E}{r} := \{X \subseteq E \mid |X| = r\}$

for the collection of $r$-subsets of $E$. We say that $X, Y \in \binom{E}{r}$ are adjacent (notation: $X \sim Y$) if
they have Hamming distance $|X \triangle Y| = 2$ (or equivalently: $|X \cap Y| = r - 1$). The Johnson graph
$J(E,r)$ is defined as the graph with vertex set $\binom{E}{r}$, in which two vertices $X$ and $Y$ are adjacent
if and only if $X \sim Y$. We abbreviate $J(n,r) := J([n],r)$. For any $r$-set $X \in \binom{E}{r}$ we write
$N(X) := \{Y \in \binom{E}{r} \mid X \sim Y\}$ for the neighborhood of $X$ in $J(E,r)$. Obviously, $J(E,r) \cong J(n,r)$
for any $n$-set $E$.
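As a concrete illustration of the definition, the neighborhoods in $J(n,r)$ can be generated directly by swapping one element of $X$ for one element outside it. This is a small sketch for experimentation only; the name `johnson_neighbors` is hypothetical.

```python
from itertools import combinations

def johnson_neighbors(n, r):
    """Map each r-subset X of {1, ..., n} to its neighborhood N(X) in J(n, r)."""
    ground = frozenset(range(1, n + 1))
    vertices = [frozenset(c) for c in combinations(sorted(ground), r)]
    neighbors = {}
    for x in vertices:
        # Y ~ X exactly when Y = X - a + b with a in X and b outside X.
        neighbors[x] = {(x - {a}) | {b} for a in x for b in ground - x}
    return neighbors
```

Each vertex obtained this way has exactly $r(n-r)$ neighbors, so the graph is regular of that degree.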

The following lemma points out the connection between the Johnson graph and sparse paving
matroids. It was essentially shown by Piff and Welsh [23] in proving an earlier lower bound on $s_n$.

Lemma 8. For $0 < r < n$, sparse paving matroids $M \in \mathbb{M}_{n,r}$ correspond one-to-one to independent
sets in $J(n,r)$.

Proof. Let $E = [n]$. We first show that the non-bases of a rank-$r$ sparse paving matroid $M$ on $E$
form an independent set in $J(E,r)$. Suppose that there are non-bases $X, Y \in \binom{E}{r} \setminus \mathcal{B}(M)$ such that
$X \sim Y$. Then we would have

$r_M(X \cap Y) + r_M(X \cup Y) \le r_M(X) + r_M(Y) < 2r - 1,$

so that either $r_M(X \cap Y) < r - 1 = |X \cap Y|$ or $r_M(X \cup Y) < r$. In the former case, $X \cap Y$ is a
dependent set of size $< r(M)$, which contradicts that $M$ is paving. In the latter case, it follows
from (5) that

$r^*_M(E \setminus (X \cup Y)) = r_M(X \cup Y) - r(M) + |E \setminus (X \cup Y)| < r - r + |E \setminus (X \cup Y)| = n - r - 1 = r^*(M) - 1,$

so that $E \setminus (X \cup Y)$ is a dependent set of $M^*$ of size $< r(M^*)$, which contradicts that $M^*$ is paving.

Next, suppose that $I$ is an independent set in $J(E,r)$. We will show that $\mathcal{B} := \binom{E}{r} \setminus I$ forms a
valid collection of bases for some matroid on $E$.

First, it cannot be that $\mathcal{B} = \emptyset$, as this would imply that $I = \binom{E}{r}$ and hence that $J(E,r)$ has no
edges. So the only way $\mathcal{B}$ may fail to be a set of bases is if it fails the basis exchange axiom (4). That is,
there are distinct $B, B' \in \mathcal{B}$ and an $e \in B \setminus B'$ such that $B - e + f \notin \mathcal{B}$ for all $f \in B' \setminus B$. Now, it
must be that $|B' \setminus B| > 1$, for otherwise it holds that $B - e + f = B' \in \mathcal{B}$ for the only $f \in B' \setminus B$.
So, let $f, f'$ be distinct elements of $B' \setminus B$, and consider $N = B - e + f$ and $N' = B - e + f'$. Since
the basis exchange axiom fails, both $N, N' \in I$. On the other hand, $|N \triangle N'| = |\{f, f'\}| = 2$, i.e.
$N \sim N'$, contradicting independence of $I$. □
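The second half of the proof can be sanity-checked by brute force for very small parameters. The sketch below enumerates every independent set $I$ of $J(n,r)$ and checks that the complementary family of $r$-sets satisfies the basis exchange axiom; it is only feasible for tiny $n$, and the function names are hypothetical.

```python
from itertools import combinations

def satisfies_basis_exchange(bases):
    """Brute-force check of the basis exchange axiom on a set of frozensets."""
    if not bases:
        return False
    return all(
        any((b1 - {e}) | {f} in bases for f in b2 - b1)
        for b1 in bases for b2 in bases for e in b1 - b2
    )

def count_and_check(n, r):
    """Count independent sets I of J(n, r) and check that each complement
    (all r-sets not in I) is a valid family of bases."""
    vertices = [frozenset(c) for c in combinations(range(n), r)]
    count, all_ok = 0, True
    for mask in range(1 << len(vertices)):
        chosen = [v for i, v in enumerate(vertices) if (mask >> i) & 1]
        if any(len(x ^ y) == 2 for x, y in combinations(chosen, 2)):
            continue  # two chosen r-sets are adjacent, so this is not independent
        count += 1
        all_ok = all_ok and satisfies_basis_exchange(set(vertices) - set(chosen))
    return count, all_ok
```

For $n = 4$, $r = 2$ the Johnson graph is the octahedron, and the count of independent sets is 10 (the empty set, six singletons and three antipodal pairs).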

2.4. Knuth's lower bound. In [14], Knuth argues that if $J(n,r)$ has an independent set $I$ of size
$k$, then $J(n,r)$ has at least $2^k$ independent sets, as each subset of $I$ is itself independent. Knuth
constructed an independent set of size $k = \frac{1}{2n}\binom{n}{r}$, but theorem 1 in [10] shows the existence of an
independent set of size at least $k = \frac{1}{n}\binom{n}{r}$.

We sketch the construction in [10]. Identifying the vertices of $J(n,r)$ with their incidence vectors,
we view them as $\{0,1\}$ vectors $(x_1, \dots, x_n)$ with exactly $r$ 1's. It is easily verified that the function
$\{0,1\}^n \to \mathbb{Z}/n\mathbb{Z}$, defined by

$(x_1, x_2, \dots, x_n) \mapsto \sum_{i=1}^{n} i x_i \bmod n$

gives a valid $n$-vertex-coloring of $J(n,r)$. As there are $n$ color classes, at least one of them should
contain at least $\frac{1}{n}\binom{n}{r}$ vertices.
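The coloring is easy to test on small cases. The following sketch (with hypothetical helper names) assigns to each $r$-subset of $\{1, \dots, n\}$ the sum of its elements modulo $n$, checks that adjacent vertices receive different colors, and reports the size of the largest color class.

```python
from collections import Counter
from itertools import combinations

def colour(subset, n):
    """Colour of an r-set: the sum of i * x_i mod n, i.e. the sum of its elements mod n."""
    return sum(subset) % n

def is_proper_colouring(n, r):
    """Adjacent r-sets differ by swapping a for b with 0 < |a - b| < n,
    so their colours should differ."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    return all(
        colour(x, n) != colour(y, n)
        for x, y in combinations(vertices, 2)
        if len(x ^ y) == 2
    )

def largest_colour_class(n, r):
    """Size of the largest colour class, an independent set of J(n, r)."""
    counts = Counter(colour(c, n) for c in combinations(range(1, n + 1), r))
    return max(counts.values())
```

By the pigeonhole argument above, `largest_colour_class(n, r)` is always at least $\binom{n}{r}/n$.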

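Picking such an independent set gives $\log s_{n,r} \ge \frac{1}{n}\binom{n}{r}$, and in particular $\log s_n \ge \log s_{n,\lfloor n/2 \rfloor} \ge
\frac{2^n}{n}\sqrt{\frac{2}{\pi n}}\,(1 - o(1))$ by (7). Therefore,

(10) $\log\log s_n \ge n - \frac{3}{2}\log n + \frac{1}{2}\log\frac{2}{\pi} - o(1).$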
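To see how quickly the $o(1)$ term vanishes, one can compare $\log k$ for $k = \frac{1}{n}\binom{n}{\lfloor n/2 \rfloor}$ with the asymptotic expression. This is a numerical illustration only, with hypothetical function names; logarithms are to base 2 as in the rest of the paper.

```python
import math

def knuth_exponent(n):
    """log2 of k = C(n, floor(n/2)) / n, a lower bound on log2 s_n."""
    return math.log2(math.comb(n, n // 2)) - math.log2(n)

def asymptotic_form(n):
    """n - (3/2) log2 n + (1/2) log2(2/pi), the main terms of the lower bound."""
    return n - 1.5 * math.log2(n) + 0.5 * math.log2(2 / math.pi)

if __name__ == "__main__":
    for n in (10, 50, 200, 1000):
        print(n, knuth_exponent(n), asymptotic_form(n))
```

The two quantities differ by a term of order $1/n$, coming from the Stirling correction to the central binomial coefficient.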

2.5. Piff's upper bound. To prove his upper bound on $m_n$, Piff uses that any matroid $M$ is
characterized by the set of all closures of circuits and their ranks, i.e. by the collection

(11) $\mathcal{K}(M) := \{(\mathrm{cl}_M(C), r_M(C)) \mid C \text{ a circuit of } M\}.$

This completely defines $M$, as a set $X \subseteq E(M)$ is dependent in $M$ if and only if $|X \cap \mathrm{cl}_M(C)| >
r_M(C)$ for some circuit $C$ of $M$. He then uses the following counting argument to bound the size
of $\mathcal{K}(M)$.

Lemma 9. If $M \in \mathbb{M}_n$, then $|\mathcal{K}(M)| \le \frac{1}{n+1} 2^{n+1}$.

Proof. Fix an $i < n$. Let $C \in \mathcal{C}(M)$ be a circuit such that $|C| = i + 1$. Then for each $e \in C$, we
have $\mathrm{cl}_M(C) = \mathrm{cl}_M(C - e)$, i.e. there are $i + 1$ sets $C - e \in \binom{E}{i}$ that map to $\mathrm{cl}_M(C)$. It follows that

$|\{(\mathrm{cl}_M(C), r) \in \mathcal{K}(M) \mid r = i\}| \le \frac{1}{i+1}\binom{n}{i} = \frac{1}{n+1}\binom{n+1}{i+1}.$

Summing these upper bounds over all $i$, we get

$|\mathcal{K}(M)| = \sum_{i=0}^{n-1} |\{(\mathrm{cl}_M(C), r) \in \mathcal{K}(M) \mid r = i\}| \le \sum_{i=0}^{n-1} \frac{1}{n+1}\binom{n+1}{i+1} \le \frac{1}{n+1} 2^{n+1}.$ □

It follows that the number of matroids on a set $E$ of $n$ elements is at most the number of subsets
$\mathcal{K} \subseteq 2^E \times \{0, \dots, n\}$ with $|\mathcal{K}| \le \frac{1}{n+1} 2^{n+1}$. Using (6) and (9), we have

$\log m_n \le \log\left((1 + o(1)) \binom{2^n(n+1)}{2^{n+1}/(n+1)}\right) \le \frac{2^{n+1}}{n+1} \log\frac{e(n+1)^2}{2} + o(1)$

and hence $\log\log m_n \le n - \log n + \log\log n + O(1)$.
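The binomial identity behind the counting step, $\frac{1}{i+1}\binom{n}{i} = \frac{1}{n+1}\binom{n+1}{i+1}$, and the resulting sum can be verified exactly with rational arithmetic. The sketch below is only a check of that arithmetic, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def closure_count_bound(n):
    """Sum over i = 0, ..., n-1 of C(n, i) / (i + 1), the bound on the number
    of (closure, rank) pairs obtained by summing the per-rank bounds."""
    return sum(Fraction(comb(n, i), i + 1) for i in range(n))

if __name__ == "__main__":
    n = 10
    for i in range(n):
        # Each term can be rewritten with denominator n + 1.
        assert Fraction(comb(n, i), i + 1) == Fraction(comb(n + 1, i + 1), n + 1)
    print(closure_count_bound(n), Fraction(2 ** (n + 1), n + 1))
```

The sum equals $(2^{n+1} - 2)/(n+1)$ exactly, which is just below the stated bound $2^{n+1}/(n+1)$.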
3. A weaker upper bound on the number of matroids

In this section, we introduce the notion of flat covers and local covers and use them to show that
each matroid in $\mathbb{M}_{n,r}$ has a concise description. Using this, we then bound $m_{n,r}$.

Definition 10 (Flat cover). Let $M = (E, \mathcal{B})$ be a matroid with $n$ elements, of rank $r$. For a set
$X \subseteq E$, we say that a flat $F \in \mathcal{F}(M)$ covers $X$ if $|F \cap X| > r_M(F)$. We say that a set of flats $\mathcal{Z}$
is a flat cover of $M$ if each non-base $X \in \binom{E}{r} \setminus \mathcal{B}$ is covered by some $F \in \mathcal{Z}$.

Note that if $\mathcal{Z}$ covers $M$, then $M$ is characterized by $E$, $r$ and the collection

$\{(F, r_M(F)) \mid F \in \mathcal{Z}\},$

since by definition of a cover, we have $\mathcal{B} = \{X \in \binom{E}{r} \mid |X \cap F| \le r_M(F) \text{ for all } F \in \mathcal{Z}\}$.

Definition 11 (Local cover). For an $r$-set $X \in \binom{E}{r}$, we say that a collection of flats $\mathcal{Z}_X \subseteq \mathcal{F}(M)$
is a local cover at $X$ if $\mathcal{Z}_X$ covers all the non-bases $Y \in (N(X) \cup \{X\}) \setminus \mathcal{B}$.



Lemma 12. Let $M \in \mathbb{M}_{n,r}$. For each $r$-set $X \in \binom{E}{r}$ there is a local cover $\mathcal{Z}_X$ such that $|\mathcal{Z}_X| \le r$.

Proof. Let $X$ be some fixed $r$-set. Take $\mathcal{Z}_X := \{\mathrm{cl}_M(X - x) \mid x \in X\}$. Then clearly $|\mathcal{Z}_X| \le r$. We
consider a $Y \in N(X) \cup \{X\}$.

If $Y = X$ and $X$ is dependent, then $X \subseteq \mathrm{cl}_M(X - x_0)$ for some $x_0 \in X$. Then $\mathrm{cl}_M(X - x_0)$
covers $X$, as

$|X \cap \mathrm{cl}_M(X - x_0)| = |X| = r > r_M(X) \ge r_M(X - x_0) = r_M(\mathrm{cl}_M(X - x_0)).$

If $Y \in N(X)$, then $Y = X - x + y$ for some $x \in X$ and $y \in E \setminus X$. If $\mathrm{cl}_M(X - x)$ covers $Y$, we
are done. Otherwise,

$r - 1 = |X - x| \le |\mathrm{cl}_M(X - x) \cap Y| \le r_M(\mathrm{cl}_M(X - x)) \le r - 1$

so that equality holds throughout, and in particular $r_M(X - x) = r - 1$ and $y \notin \mathrm{cl}_M(X - x)$. It
follows that $r_M(Y) = r_M(X - x + y) = r$, so that $Y$ is a basis and it is not required to cover $Y$. □
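The construction in the proof can be tried out on a small matroid given by its bases. In the sketch below (all names hypothetical), the rank of a set is computed as its largest intersection with a basis, the closure as the set of elements that do not raise the rank, and the local cover at $X$ as the closures of the sets $X - x$. The check confirms that, among $X$ and its neighbors, exactly the non-bases are covered.

```python
from itertools import combinations

def make_rank(bases):
    """Rank function of a matroid given by its bases: r(X) = max |X ∩ B|."""
    def rank(x):
        return max(len(x & b) for b in bases)
    return rank

def closure(x, ground, rank):
    """cl(X): all elements e with r(X + e) = r(X)."""
    return frozenset(e for e in ground if rank(x | {e}) == rank(x))

def local_cover(x, ground, rank):
    """The local cover {cl(X - e) : e in X} used in the proof of Lemma 12."""
    return {closure(x - {e}, ground, rank) for e in x}

def verify_local_covers(ground, bases, r):
    """For every r-set X, check |Z_X| <= r and that an r-set Y in N(X) + {X}
    is covered by some flat in Z_X exactly when Y is a non-basis."""
    rank = make_rank(bases)
    r_sets = [frozenset(c) for c in combinations(sorted(ground), r)]
    for x in r_sets:
        cover = local_cover(x, ground, rank)
        if len(cover) > r:
            return False
        for y in r_sets:
            if len(x ^ y) > 2:
                continue  # y is neither x nor a neighbour of x
            covered = any(len(f & y) > rank(f) for f in cover)
            if covered != (y not in bases):
                return False
    return True
```

As a usage example, one may take the graphic matroid of $K_4$ on six edges labelled 0 to 5 (an illustrative choice, not from the text), whose non-bases are the four triangles {0,1,3}, {0,2,4}, {1,2,5} and {3,4,5}.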

If $G = (V, E)$ is a graph, then a set $D \subseteq V$ is dominating if $D \cup N(D) = V$. The point of
introducing local covers is that one can construct a small flat cover from a collection of local covers
at the vertices in some small dominating set, as every non-basis in the matroid will be covered by
this collection. By standard probabilistic arguments (see theorem 1.2.2 of [3]), one has:

Lemma 13. $J(n,r)$ has a dominating set of cardinality at most $\frac{\ln(r(n-r)+1)+1}{r(n-r)+1}\binom{n}{r}$.

Corollary 14. Let $M \in \mathbb{M}_{n,r}$. Then $M$ has a flat cover $\mathcal{Z}$ with $|\mathcal{Z}| \le r \, \frac{\ln(r(n-r)+1)+1}{r(n-r)+1}\binom{n}{r}$.

Proof. By lemma 13, $J(n,r)$ has a dominating set $D$ with $|D| \le \frac{\ln(r(n-r)+1)+1}{r(n-r)+1}\binom{n}{r}$. For each $X \in D$,
let $\mathcal{Z}_X$ be a local cover of $M$ at $X$ as in lemma 12. Then $|\mathcal{Z}_X| \le r$ for each $X \in D$. Take $\mathcal{Z} := \bigcup_{X \in D} \mathcal{Z}_X$.
Then $\mathcal{Z}$ is a cover of $M$, and $|\mathcal{Z}| \le r|D|$. □

Theorem 5. $\log\log m_n \le n - \frac{3}{2}\log n + 2\log\log n + O(1)$.

Proof. Denote the upper bound in corollary 14 by $k_{n,r} := r \, \frac{\ln(r(n-r)+1)+1}{r(n-r)+1}\binom{n}{r}$. As each matroid in
$\mathbb{M}_{n,r}$ is uniquely determined by the set $\{(F, r_M(F)) \mid F \in \mathcal{Z}\} \subseteq 2^E \times \{0, \dots, n\}$, where $\mathcal{Z}$ is a cover
of size bounded by $k_{n,r}$, the number of matroids in $\mathbb{M}_{n,r}$ is bounded by the number of subsets of a set
of size $2^n(n+1)$ of cardinality at most $k_{n,r}$, so by (9) we have $m_{n,r} \le \binom{2^n(n+1)}{k_{n,r}}(1 + o(1))$.

Now, for any $r \le n/2$,

$k_{n,r} \le \frac{4 \ln n}{n}\binom{n}{\lfloor n/2 \rfloor}.$

So, by (6) and (7), it follows that for $r \le n/2$

$\log m_{n,r} \le k_{n,\lfloor n/2 \rfloor} \log\frac{e \, 2^n (n+1)}{k_{n,\lfloor n/2 \rfloor}} \le \frac{4 \ln n}{n}\binom{n}{\lfloor n/2 \rfloor} \cdot \frac{5 + o(1)}{2} \log n.$

The same applies if $r > n/2$, as $m_{n,r} = m_{n,n-r}$ due to matroid duality. As $m_n = \sum_r m_{n,r}$, we have

$\log\log m_n \le n - \frac{3}{2}\log n + 2\log\log n + O(1)$ as required. □



The difference between this upper bound and the lower bound of Knuth is $2\log\log n + O(1)$. An
upper bound on the cardinality of a dominating set of the Johnson graph $J(n,r)$ that is closer to
$\binom{n}{r}/(r(n-r)+1)$ could improve this gap to $\log\log n + O(1)$ at best. We cannot expect to do better:
applied to bound the number of sparse paving matroids, the above proof is inherently as wasteful
as using (3). Note that a minimal cover of a sparse paving matroid just lists the non-bases. We
proceed by describing a better technique for bounding the number of sparse paving matroids.



4. A procedure for encoding vertex sets

4.1. The procedure. We now describe a procedure given by Alon, Balogh, Morris and Samotij [2],
for which they refer to Kleitman and Winston [13] as the original source. They use the procedure
to encode an independent set $I$ as a pair $(S, I \setminus S)$, such that $S \subseteq I$ and the number of possibilities
for both $S$ and $I \setminus S$ can be controlled. We will use it for that purpose in this section as well, but
to prepare for other uses in this paper we generalize the procedure so that it takes a general vertex
set $K$ and produces a pair $(S, A)$, satisfying

(12) $S \subseteq K \subseteq S \cup N(S) \cup A.$

We stress that the encoding is not one-to-one, and several sets $K$ may produce the same pair $(S, A)$.
We will later describe why such a pair $(S, A)$ is useful.

Throughout this section, $G$ is a $d$-regular graph on $N$ vertices, and the smallest eigenvalue of its
adjacency matrix is $-\lambda$. We denote $\alpha := \lambda/(d + \lambda)$. For a subset $A \subseteq V$, let $G[A]$ denote the subgraph of
$G$ induced by $A$. Let $e(A)$ denote the number of edges with both end points in $A$, i.e. the number
of edges in $G[A]$. Let us assume there is some fixed linear ordering $<_V$ of the vertices of $G$ (say
according to their indices $1, \dots, N$). By the canonical ordering of $A \subseteq V$, we refer to the following
procedure to order the set $A$ linearly. Let $v$ be the vertex with maximum degree in $G[A]$; if there
are multiple such $v$, take the one that is smallest with respect to $<_V$. Call $v$ the first vertex in the
canonical ordering, and apply the procedure iteratively to $A \setminus \{v\}$.

The procedure to produce $(S, A)$ (see Figure 1) maintains two disjoint sets of vertices: $S$ for
selected and $A$ for available. Initially, no vertices are selected ($S = \emptyset$) and all vertices are available
($A = V$). During the procedure, the set $S$ will expand and the set $A$ will shrink, until $|A| \le \alpha N$.
Throughout we will maintain (12) as an invariant.

Input : G = (V, E), K ⊆ V
Output: (S, A)

Set A ← V and S ← ∅;
while |A| > αN do
    Pick the first vertex v in the canonical ordering of A;
    if v ∈ K then
        set S ← S ∪ {v} and set A ← A \ (N(v) ∪ {v});
    else
        set A ← A \ {v};
    end
end
Output (S, A).

Figure 1: The encoding procedure.

The following is a simple but subtle and crucial observation from [2].

Lemma 15. Upon the termination of the algorithm, the set $A$ is completely determined by $S$
(irrespective of the set $K$).

This follows as at any step in the algorithm, the vertices chosen thus far in $S$ completely determine
the remaining vertices and their ordering. In particular, given $S$ one can recover $A$ as follows:
Initialize $X = V$ and $T = S$. Repeat the following steps until $|X| \le \alpha N$ (the resulting set $X$
when the algorithm terminates will be $A$). (i) Consider the canonical ordering of $X$, and let $v$ be
the first vertex in this ordering. (ii) If $v \in T$, discard $v$ from $T$ and $\{v\} \cup N(v)$ from $X$ and go to
step (i). Otherwise, discard $v$ from $X$ and go to step (i).
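The encoding procedure and the recovery of $A$ from $S$ can be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the graph is an adjacency dictionary mapping each vertex to the set of its neighbors, vertex labels are comparable and serve as the fixed linear ordering, and the parameter `threshold` plays the role of $\alpha N$. All function names are hypothetical.

```python
def first_in_canonical_order(avail, adj):
    """Vertex of maximum degree in G[avail]; ties go to the smallest label."""
    return min(avail, key=lambda v: (-len(adj[v] & avail), v))

def encode(adj, k_set, threshold):
    """Run the encoding procedure on the vertex set k_set and return (S, A)."""
    selected, avail = set(), set(adj)
    while len(avail) > threshold:
        v = first_in_canonical_order(avail, adj)
        if v in k_set:
            selected.add(v)
            avail -= adj[v] | {v}   # remove v together with its neighbours
        else:
            avail.discard(v)        # remove v only
    return selected, avail

def recover_available(adj, selected, threshold):
    """Recover A from S alone, replaying the same canonical choices."""
    avail, pending = set(adj), set(selected)
    while len(avail) > threshold:
        v = first_in_canonical_order(avail, adj)
        if v in pending:
            pending.discard(v)
            avail -= adj[v] | {v}
        else:
            avail.discard(v)
    return avail
```

On the 8-cycle with `k_set = {0, 2, 5}` and `threshold = 4`, the procedure selects vertex 0, then discards vertex 3, and stops with `S = {0}` and `A = {2, 4, 5, 6}`; replaying with `S` alone yields the same `A`.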

4.2. Application to counting independent sets. Later we will show that

Lemma 16. The number of vertices selected into $S$ is at most $\left\lceil \frac{\ln(d+1)}{d+\lambda} N \right\rceil$.

Let us first see how this implies the following upper bound on $i(G)$, the number of independent
sets in $G$.

Theorem 7. Let $G$ be a $d$-regular graph on $N$ vertices with smallest eigenvalue $-\lambda$. Then

$i(G) \le \sum_{i=0}^{\lceil \sigma N \rceil} \binom{N}{i} \, 2^{\alpha N},$

where $\alpha = \frac{\lambda}{d + \lambda}$ and $\sigma = \frac{\ln(d+1)}{d + \lambda}$.

Proof. Let $K$ be any independent set. Running the procedure yields a pair $(S, A)$ with $|A| \le \alpha N$,
such that (i) $A$ is completely determined by $S$ (by lemma 15) and (ii) $S \subseteq K \subseteq S \cup N(S) \cup A$.
Now, since $K$ is independent and $S \subseteq K$, we have $N(S) \cap K = \emptyset$. Together with (ii) above, this
implies that $(S \cup N(S)) \cap K = S$. Thus, $K = S \cup (K \cap A)$ and hence $K$ is completely determined
by $S$ and $K \cap A$.

As $A$ is completely determined by $S$, for a fixed $S$, there are at most $2^{\alpha N}$ possibilities for
$K \cap A$. Moreover, as $|S| \le \lceil \sigma N \rceil$, the number of ways of choosing $S$ is at most $\sum_{i=0}^{\lceil \sigma N \rceil} \binom{N}{i}$,
which gives the claimed bound on $i(G)$. □
4.3. Analysis. We now prove lemma 16. We first need the following lemma that was proved by
Alon and Chung in [1], and earlier by Haemers (theorem 2.1.4 (i) of [11]). We use the version of
the lemma stated in [2].

Lemma 17. For all $A \subseteq V(G)$, we have $2e(A) \ge |A| \left( \frac{d}{N}|A| - \frac{\lambda}{N}(N - |A|) \right)$.

Proof. Let $x$ denote the incidence vector of the set $A$, and let $B$ denote the adjacency matrix of $G$. Then
the number of edges is given by $e(A) = \frac{1}{2} x^T B x$. Let $v$ be the all 1's vector scaled by $|A|/N$; then
$v \cdot (x - v) = 0$, and $v$ is an eigenvector of $B$ with eigenvalue $d$. As $B$ is symmetric its eigenvectors
are orthogonal and hence $v \perp (x - v)$ implies that $v \perp B(x - v)$. Thus

$2e(A) = x^T B x = (x - v + v)^T B (x - v + v)$

$= (x - v)^T B (x - v) + v^T B v$

$\ge -\lambda \|x - v\|^2 + d \|v\|^2$

$= -\lambda \left( \frac{(N - |A|)|A|^2}{N^2} + \frac{|A|(N - |A|)^2}{N^2} \right) + \frac{d |A|^2}{N}$

$= |A| \left( \frac{d}{N}|A| - \frac{\lambda}{N}(N - |A|) \right).$ □

Corollary 18. For any $\varepsilon > 0$, if $|A| = (\alpha + \varepsilon)N$, then $G[A]$ contains a vertex of degree at least
$\varepsilon(d + \lambda)$.

Proof. Let $A$ be a vertex set of size $(\alpha + \varepsilon)N$. By lemma 17, $2e(A) \ge |A|(d + \lambda)\varepsilon$. Hence the
average degree in $G[A]$ is at least $\varepsilon(d + \lambda)$, and so there must be some vertex in $G[A]$ of degree
$\ge \varepsilon(d + \lambda)$. □

In particular, it follows that an independent set $A \subseteq V(G)$ has size at most $\alpha N$.
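The edge-count inequality can be verified exhaustively on a small Johnson graph. The sketch below assumes the inequality in the form $2e(A) \ge |A|\left(\frac{d}{N}|A| - \frac{\lambda}{N}(N - |A|)\right)$ and uses $\lambda = \min(r, n-r)$ for $J(n,r)$, which is the magnitude of the smallest eigenvalue given by the standard eigenvalue formula; the function name is hypothetical, and the check is only feasible when $\binom{n}{r}$ is small.

```python
from itertools import combinations

def check_edge_bound(n, r):
    """Check N * 2e(A) >= |A| * (d|A| - lam * (N - |A|)) for every vertex
    subset A of J(n, r), using integer arithmetic throughout."""
    vertices = [frozenset(c) for c in combinations(range(n), r)]
    big_n = len(vertices)
    d = r * (n - r)            # J(n, r) is r(n - r)-regular
    lam = min(r, n - r)        # assumed: smallest eigenvalue is -min(r, n - r)
    for mask in range(1 << big_n):
        a = [v for i, v in enumerate(vertices) if (mask >> i) & 1]
        twice_edges = 2 * sum(1 for x, y in combinations(a, 2) if len(x ^ y) == 2)
        if big_n * twice_edges < len(a) * (d * len(a) - lam * (big_n - len(a))):
            return False
    return True
```

For $J(5,2)$ this gives $d = 6$, $\lambda = 2$ and $\alpha N = 2.5$, consistent with the fact that its largest independent sets (pairs of disjoint 2-sets) have size 2.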

Proof of lemma 16. Say that the procedure is in phase $j$ if the current $A$ satisfies

$\frac{|A|}{N} \in \left( \alpha + \frac{j-1}{d+\lambda}, \; \alpha + \frac{j}{d+\lambda} \right] \quad (j = d, d-1, \dots, 1).$

Then each phase sees the removal of at most $N/(d+\lambda)$ vertices from $A$. By corollary 18, any vertex
that gets selected into S during phase j has degree > j — 1, hence removes at least j + 1 vertices 
from A (the vertex selected into S, and its neighbors). Let S(j) be the set of vertices that get 
selected into S during phase j, then the above argument shows that \S(j)\ < ^ + x^(j+i) + " J 1 
where Uj is the fractional number of vertices that get removed in phase j — 1 due to the insertion 
of a vertex in S(j). It follows that 

1 1 uyi _ d + A^-fj + l f-f 7 + 1 d + A 

as < < j + 1, no = and Y2i=i \ — 1 + hi fe. The lemma follows. □ 
5. An upper bound on the number of sparse paving matroids 

As mentioned previously, sparse paving matroids of rank $r$ on groundset $[n]$ correspond one-to-one
with independent sets in the Johnson graph $J(n,r)$. Thus $s_{n,r} = i(J(n,r))$, and we may apply
theorem 7 to bound the number of sparse paving matroids. We first investigate the parameters
that occur in this application of the theorem.
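The correspondence can be checked by brute force for a tiny case. The sketch below (helper names are ours) enumerates the independent sets of $J(4,2)$, treats each as a set of non-bases, and verifies that the remaining $r$-sets satisfy the basis exchange axiom.

```python
from itertools import combinations


def johnson_graph(n, r):
    """Vertices are the r-subsets of range(n); two are adjacent when they share r-1 elements."""
    verts = [frozenset(c) for c in combinations(range(n), r)]
    adj = {v: {w for w in verts if len(v & w) == r - 1} for v in verts}
    return verts, adj


def is_basis_family(bases):
    """Basis exchange: for B1, B2 and x in B1 - B2 there is y in B2 - B1 with B1 - x + y a basis."""
    if not bases:
        return False
    return all(
        any((b1 - {x}) | {y} in bases for y in b2 - b1)
        for b1 in bases
        for b2 in bases
        for x in b1 - b2
    )


n, r = 4, 2
verts, adj = johnson_graph(n, r)

# All vertex sets with no two members adjacent in J(n, r).
independent_sets = [
    set(I)
    for k in range(len(verts) + 1)
    for I in combinations(verts, k)
    if all(w not in adj[v] for v, w in combinations(I, 2))
]

# Each independent set, taken as the set of non-bases, should leave a matroid.
all_matroids = all(is_basis_family(set(verts) - I) for I in independent_sets)
print(len(independent_sets), all_matroids)
```

Here $J(4,2)$ is the octahedron, so the independent sets are the empty set, the six single vertices and the three antipodal pairs.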

Recall that $J(n, r)$ is $r(n - r)$-regular and has $\binom{n}{r}$ vertices. The eigenvalues of the adjacency
matrix of $J(n, r)$ are $r(n - r) - i(n + 1 - i)$ for $i = 0, 1, \ldots, r$ (see [6]). This identifies the smallest
eigenvalue as $-\lambda_{n,r}$, where



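\[ \lambda_{n,r} := \begin{cases} r & \text{if } r \leq n/2, \\ \left(\frac{n+1}{2}\right)^2 - r(n - r) & \text{if } r > n/2 \text{ and } n \text{ is odd}, \\ \frac{n}{2}\left(\frac{n}{2} + 1\right) - r(n - r) & \text{if } r > n/2 \text{ and } n \text{ is even}. \end{cases} \tag{13} \]

Define

\[ \alpha_{n,r} := \frac{\lambda_{n,r}}{r(n - r) + \lambda_{n,r}}. \]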
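To make these parameters concrete, the following sketch evaluates them with exact rational arithmetic. It is restricted to $r \leq n/2$, where the minimum of the eigenvalue list above is attained at $i = r$, so that $\lambda_{n,r} = r$ and $\alpha_{n,r} = 1/(n - r + 1)$; the function name `johnson_parameters` is ours.

```python
from fractions import Fraction
from math import comb


def johnson_parameters(n, r):
    """Return (N, d, lam, alpha) for J(n, r), assuming 0 < r <= n/2."""
    if not 0 < 2 * r <= n:
        raise ValueError("this sketch only covers 0 < r <= n/2")
    d = r * (n - r)
    # Eigenvalues r(n - r) - i(n + 1 - i) for i = 0, ..., r, as quoted in the text.
    eigenvalues = [d - i * (n + 1 - i) for i in range(r + 1)]
    lam = -min(eigenvalues)
    alpha = Fraction(lam, d + lam)
    return comb(n, r), d, lam, alpha


for n, r in [(4, 2), (10, 5), (10, 3)]:
    N, d, lam, alpha = johnson_parameters(n, r)
    print(n, r, N, d, lam, alpha, alpha * N)
```

For these values $\alpha_{n,r} \leq 2/n$, in line with the bound used in the next lemma.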

If $\epsilon$ is such that $r = \frac{n}{2}(1 + \epsilon)$, then a straightforward calculation shows that



2 o 2 , 2t* , o 

«n,r <-+£=- + ( 1). 

n n n 

Lemma 19. $\alpha_{n,r} \binom{n}{r} \leq \left( \frac{2}{n} + O\left(\frac{1}{n^2}\right) \right) \binom{n}{\lfloor n/2 \rfloor}$.

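Proof. (Sketch) It can be shown that the maximum of the function $\epsilon \mapsto \left(\frac{2}{n} + \epsilon^2\right) \binom{n}{n(1+\epsilon)/2}$ is achieved
at $\epsilon = 0$. In particular, the two terms in the function behave as $\left(\frac{2}{n} + \epsilon^2\right) \leq \frac{2}{n} e^{\epsilon^2 n/2}$ and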

n \ < ( j_v n/2 ( n ^ we -«w n \ 

n(l + e)/2) - \l + e) \n/2j W 2 / 
and hence the term $e^{\epsilon^2 n/2}$ essentially cancels out. We omit the details. □

This gives us sufficient control over the parameters to prove theorem 6 from theorem 7.

Theorem 6. $\log \log s_n \leq n - \frac{3}{2} \log n + \frac{1}{2} \log \frac{2}{\pi} + 1 + o(1)$.

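Proof. By theorem 7 we have

\[ s_{n,r} = i(J(n,r)) \leq \lceil \sigma_{n,r} N \rceil \binom{N}{\lceil \sigma_{n,r} N \rceil} 2^{\alpha_{n,r} N}, \]

where $N = \binom{n}{r}$ and $\sigma_{n,r} = \frac{\ln(r(n-r)+1)}{r(n-r) + \lambda_{n,r}}$. Taking logarithms and applying (6) to the factor $\binom{N}{\lceil \sigma_{n,r} N \rceil}$,
we obtain $\log s_{n,r} \leq \log N + \lceil \sigma_{n,r} N \rceil \log(n^2) + \alpha_{n,r} N$.

As $\lceil \sigma_{n,r} N \rceil \leq \frac{4 \ln(n^2)}{n^2} \binom{n}{\lfloor n/2 \rfloor}$, an application of lemma 19 shows that

\[ \log s_{n,r} \leq \log \binom{n}{\lfloor n/2 \rfloor} + \frac{4 \ln(n^2)}{n^2} \binom{n}{\lfloor n/2 \rfloor} \log(n^2) + \left( \frac{2}{n} + O\left(\frac{1}{n^2}\right) \right) \binom{n}{\lfloor n/2 \rfloor}. \]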
As the latter term is substantially larger than the first two, we have $\log s_{n,r} \leq \frac{2}{n} \binom{n}{\lfloor n/2 \rfloor} (1 + o(1))$.
As $s_n = \sum_{r=0}^{n} s_{n,r} \leq (n + 1) \max_r s_{n,r}$, we also have $\log s_n \leq \frac{2}{n} \binom{n}{\lfloor n/2 \rfloor} (1 + o(1))$. Taking logarithms
and applying (7) to bound the binomial coefficient, the result follows. □
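The final step can be illustrated numerically. The sketch below compares $\log\left(\frac{2}{n}\binom{n}{\lfloor n/2 \rfloor}\right)$, computed from exact integers, with the closed form $n - \frac{3}{2}\log n + \frac{1}{2}\log\frac{2}{\pi} + 1$. It assumes that the binomial estimate referred to as (7) is the standard one, $\binom{n}{\lfloor n/2 \rfloor} \approx 2^n \sqrt{2/(\pi n)}$; the function names are ours.

```python
from math import comb, log2, pi


def exact_side(n):
    """log2 of (2/n) * C(n, floor(n/2)); the integer is exact before taking logs."""
    return log2(2 * comb(n, n // 2)) - log2(n)


def closed_form(n):
    """n - (3/2) log2 n + (1/2) log2(2/pi) + 1, the bound without its o(1) term."""
    return n - 1.5 * log2(n) + 0.5 * log2(2 / pi) + 1


for n in (10, 100, 1000):
    print(n, exact_side(n), closed_form(n), exact_side(n) - closed_form(n))
```

The difference between the two columns shrinks as $n$ grows, which is the $o(1)$ term in the theorem.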



6. An upper bound on the number of matroids 

We will now show the upper bound on $m_n$ claimed in theorem 3. To do this, we first show that
substantially better local covers at $X$ exist if $X$ is a non-basis. Later, we combine this fact with
the encoding procedure in section 4 to find a very concise encoding of a matroid $M \in \mathbb{M}_{n,r}$.
6.1. Improved Local Covers.

Lemma 20. Let $M \in \mathbb{M}_{n,r}$. For each $r$-set $X \in \binom{[n]}{r}$ that is dependent in $M$, there exists a set
$Z_X \subseteq \mathcal{F}(M)$ such that each non-basis $Y \in N(X) \cup \{X\}$ is covered by some $F \in Z_X$, and $|Z_X| \leq 2$.

Proof. Let $X$ be some fixed $r$-set. If $r_M(X) < r - 1$, take $Z_X := \{\mathrm{cl}_M(X)\}$. Then if $Y \in N(X)$ or
$Y = X$, we have

\[ |\mathrm{cl}_M(X) \cap Y| \geq |X \cap Y| \geq r - 1 > r_M(X) = r_M(\mathrm{cl}_M(X)). \]

If $r_M(X) = r - 1$, then $X$ contains a unique circuit $C$ of $M$. Take $Z_X := \{\mathrm{cl}_M(C), \mathrm{cl}_M(X)\}$. If
$Y \in N(X)$ is not a basis, then by submodularity

\[ r_M(X \cup Y) + r_M(X \cap Y) \leq r_M(X) + r_M(Y) < 2r - 1, \]

so that $r_M(X \cup Y) < r$ or $r_M(X \cap Y) < r - 1$. In the former case, we have $Y \subseteq \mathrm{cl}_M(X)$, hence

\[ |\mathrm{cl}_M(X) \cap Y| = r > r_M(X \cup Y) \geq r_M(X) = r_M(\mathrm{cl}_M(X)). \]

In the latter case, $X \cap Y$ is dependent and hence must contain a circuit $C'$, and as $C$ is the unique
circuit contained in $X$, we must have $C' = C$. Then $|\mathrm{cl}_M(C) \cap Y| \geq |C| > r_M(C) = r_M(\mathrm{cl}_M(C))$. □
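The construction in this proof can be exercised on a small example. The sketch below assumes, as the inequalities in the proof suggest, that a flat $F$ covers an $r$-set $Y$ when $|F \cap Y| > r_M(F)$. The example matroid (rank 2 on six elements, with loops 0 and 5 and parallel pair 1, 2) and all helper names are ours.

```python
from itertools import combinations


def make_matroid(ground, bases):
    """Return (bases, rank, closure) for the matroid given by its family of bases."""
    bases = {frozenset(b) for b in bases}

    def rank(X):
        return max(len(X & b) for b in bases)

    def closure(X):
        X = frozenset(X)
        return frozenset(e for e in ground if rank(X | {e}) == rank(X))

    return bases, rank, closure


def local_cover(X, r, rank, closure):
    """The set Z_X built in the proof, for a dependent r-set X."""
    if rank(X) < r - 1:
        return [closure(X)]
    # rank(X) == r - 1: the unique circuit is formed by the elements whose removal keeps the rank.
    C = frozenset(e for e in X if rank(X - {e}) == rank(X))
    return [closure(C), closure(X)]


def covers(F, Y, rank):
    """Assumed notion of covering: F meets Y in more elements than its rank."""
    return len(F & Y) > rank(F)


ground, r = range(6), 2
bases, rank, closure = make_matroid(ground, [(1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])

r_sets = [frozenset(c) for c in combinations(ground, r)]
non_bases = [X for X in r_sets if X not in bases]

# Every non-basis Y equal or adjacent to a non-basis X should be covered by some flat in Z_X.
all_covered = all(
    any(covers(F, Y, rank) for F in local_cover(X, r, rank, closure))
    for X in non_bases
    for Y in non_bases
    if Y == X or len(X & Y) == r - 1
)
print(len(non_bases), all_covered)
```

The pair of loops $\{0, 5\}$ exercises the case $r_M(X) < r - 1$, while the remaining non-bases fall under the case $r_M(X) = r - 1$.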

6.2. Matroid Encoding. The crucial difference from lemma 12 is the assumption that $X$ is a
dependent set in $M$. This allows us to obtain a much smaller bound on the size of a cover of
$M$, if we can identify a small collection of non-bases of $M$ such that their neighborhood contains
all non-bases in a large fraction of the $r$-sets. This is exactly what the encoding algorithm will
accomplish. We now give the details.

Theorem 3. The number of matroids $m_n$ on $n$ elements satisfies

\[ \log \log m_n \leq n - \frac{3}{2} \log n + \frac{1}{2} \log \frac{2}{\pi} + 1 + o(1). \]

Proof. Consider a matroid $M \in \mathbb{M}_{n,r}$, and let $K := \binom{[n]}{r} \setminus \mathcal{B}(M)$ be the set of its non-bases. Then
$K$ is a set of vertices of the graph $G = J(n,r)$. As before, let $N = \binom{n}{r}$ denote the number of
vertices of $G$, let $d = r(n - r)$ be its degree, let $\alpha = \frac{\lambda}{d + \lambda}$ and $\sigma = \frac{\ln(d+1)}{d + \lambda}$, where $-\lambda$ is the smallest
eigenvalue of $G$.

We describe how to obtain a concise description of M. 

Apply the encoding procedure to $K$ and obtain sets $S, A$ such that $|A| \leq \alpha N$, $|S| \leq \lceil \sigma N \rceil$, $A$ is
determined by $S$, and

\[ S \subseteq K \subseteq S \cup N(S) \cup A. \]

By lemma 20, there exists a local cover $Z_X$ of $(\{X\} \cup N(X)) \setminus \mathcal{B}(M)$ with $|Z_X| \leq 2$ for each $X \in S$,
noting that each such $X$ is a dependent set of $M$. Then

\[ Z := \bigcup_{X \in S} Z_X \]

covers all $Y \in (S \cup N(S)) \setminus \mathcal{B}(M)$, and $|Z| \leq 2|S|$. As all members of $K \setminus A$ lie in $S \cup N(S)$, the set
$K \setminus A$ is fully determined by $\{(F, r_M(F)) \mid F \in Z\}$. For the remaining non-bases in $K \cap A$, we can
simply list them. Thus, $(\{(F, r_M(F)) \mid F \in Z\}, K \cap A)$ gives a complete and concise description of
the non-bases in a matroid.

This bounds the number of matroids in $\mathbb{M}_{n,r}$ by the number of ways of choosing $S$ from an
$N$-set, times the number of ways of choosing the collection $\{(F, r_M(F)) \mid F \in Z\}$ from a set of size
$2^n(n + 1)$, times the number of possible subsets from $A$. As $|A| \leq \alpha N$, $|S| \leq \lceil \sigma N \rceil$ and $|Z| \leq 2|S|$,
this yields
We have $\lceil \sigma N \rceil \leq \frac{4 \ln(n^2)}{n^2} \binom{n}{\lfloor n/2 \rfloor}$, and $\alpha N \leq \frac{2}{n} \binom{n}{\lfloor n/2 \rfloor} (1 + o(1))$ by lemma 19. So the bound on $\alpha N$
dominates in

logm n , r < Iog(r<rJ\ri) + 2\aN] log ^ J + aN < - [ J (1 + o(l)). 
Since $m_n = \sum_{r=0}^{n} m_{n,r} \leq (n + 1) \max_r m_{n,r}$, we also have

\[ \log m_n \leq \frac{2}{n} \binom{n}{\lfloor n/2 \rfloor} (1 + o(1)). \tag{14} \]

Taking logarithms and applying ([7]) to bound the binomial coefficient, the theorem follows. □ 

Combining the above theorem with Knuth's lower bound on s n fllOf) . we obtain: 

Corollary 4. $\log \log m_n \leq \log \log s_n + 1 + o(1)$.

7. Further directions 

7.1. The maximal independent sets of the Johnson graph. By Knuth's lower bound and
(14), we have

\[ \frac{1}{n} \binom{n}{\lfloor n/2 \rfloor} \leq \log s_n \leq \log m_n \leq \alpha_n \binom{n}{\lfloor n/2 \rfloor} (1 + o(1)), \]

where $\alpha_n \sim \frac{2}{n}$. So asymptotically, there is a factor 2 between lower and upper bound, which turns
up as the additive $+1$ term in corollary 4. The lower bound is the size of an independent set in
$J(n, \lfloor n/2 \rfloor)$ as constructed by Graham and Sloane [10]. As far as we know, the best general upper
bound on the size of such an independent set is $\alpha_n \binom{n}{\lfloor n/2 \rfloor}$, as a consequence of corollary 18.

It seems that a better understanding of the maximum size of an independent set in $J(n, r)$ could
lead to better bounds for $m_n$. If it could be shown that $J(n, \lfloor n/2 \rfloor)$ actually has an independent
set of size $\alpha_n \binom{n}{\lfloor n/2 \rfloor}$, then the gap of $+1$ in corollary 4 would disappear. On the other hand, if the
maximum size of an independent set is at most $\frac{1}{n} \binom{n}{\lfloor n/2 \rfloor}$, then a technique to show such an upper
bound could potentially be useful for bounding $m_n$.

7.2. The structure of independent sets of the Johnson graph. The upper bound on the 
number of independent sets in the Johnson graph J(n, r) shows that among the sets of vertices 
of size at most $\alpha(J(n,r))$ there are relatively few independent sets. We ask if there could be
a structural reason for this, that is, a structure common to all independent sets from which it 
directly follows that there cannot be many independent sets. 

For example, if there exists a 'small' set $U$ of maximal independent sets of $J(n, r)$ and a 'small'
$k \in \mathbb{N}$, such that for each independent set $I$ of $J(n, r)$ there is a $J \in U$ such that $|I \setminus J| \leq k$, then it
would follow that there are at most $|U| \binom{N}{k} 2^{\alpha N}$ independent sets, where $N = \binom{n}{r}$ and $\alpha N$ bounds
the cardinality of an independent set of $J(n,r)$.

7.3. The number of matroids without circuit-hyperplanes. If $M = (E, \mathcal{B})$ is a matroid of
rank $r$, then the following are equivalent for a subset $X \subseteq E$:

• $X$ is both a circuit and a hyperplane of $M$; and

• $X$ is an isolated vertex of $J(E, r) \setminus \mathcal{B}$.

It is well-known that if $X$ is a circuit-hyperplane of $M$, then relaxing $X$ gives another matroid
$(E, \mathcal{B} \cup \{X\})$. More strongly, we have

Lemma 21. Let $0 < r < |E|$, let $\mathcal{B} \subseteq \binom{E}{r}$ and let $U$ be a set of isolated vertices of $J(E,r) \setminus \mathcal{B}$.
Then

\[ (E, \mathcal{B}) \text{ is a matroid} \iff (E, \mathcal{B} \cup U) \text{ is a matroid.} \]
By the lemma, each matroid $M = (E, \mathcal{B})$ can be decomposed uniquely as $(\widetilde{M}, U)$, where $\widetilde{M} :=
(E, \mathcal{B} \cup U)$ and $U$ is the set of all isolated vertices of $J(E, r) \setminus \mathcal{B}$. Then $\widetilde{M}$ has no circuit-hyperplanes,
and $U$ is an independent set of the Johnson graph. Hence,

\[ m_{n,r} / i(J(n,r)) \leq |\{M \in \mathbb{M}_{n,r} \mid M \text{ has no circuit-hyperplanes}\}|. \]
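The decomposition can be sketched as follows: find the isolated vertices of the graph induced on the non-bases and add them to the bases. The function names `decompose` and `is_basis_family` are ours, and the two small examples (rank 2 on four elements) are chosen only for illustration.

```python
from itertools import combinations


def is_basis_family(bases):
    """Basis exchange: for B1, B2 and x in B1 - B2 there is y in B2 - B1 with B1 - x + y a basis."""
    if not bases:
        return False
    return all(
        any((b1 - {x}) | {y} in bases for y in b2 - b1)
        for b1 in bases
        for b2 in bases
        for x in b1 - b2
    )


def decompose(ground, r, bases):
    """Return (bases plus U, U), where U is the set of non-bases with no adjacent non-basis."""
    bases = {frozenset(b) for b in bases}
    non_bases = [frozenset(c) for c in combinations(ground, r) if frozenset(c) not in bases]
    # Two r-sets are adjacent in the Johnson graph when they share exactly r - 1 elements.
    U = {X for X in non_bases if all(len(X & Y) != r - 1 for Y in non_bases)}
    return bases | U, U


ground, r = range(4), 2

# Example 1: two disjoint parallel pairs {0,1} and {2,3}; both non-bases are circuit-hyperplanes.
bases1 = [(0, 2), (0, 3), (1, 2), (1, 3)]
relaxed1, U1 = decompose(ground, r, bases1)

# Example 2: element 0 is a loop; the non-bases {0,1}, {0,2}, {0,3} are pairwise adjacent.
bases2 = [(1, 2), (1, 3), (2, 3)]
relaxed2, U2 = decompose(ground, r, bases2)

print(sorted(map(sorted, U1)), len(relaxed1), sorted(map(sorted, U2)), len(relaxed2))
```

In the first example the relaxed matroid is the uniform matroid on four elements of rank 2, and applying `decompose` to it again finds nothing further to relax.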

It follows that any upper bound on the number of circuit-hyperplane-free matroids will yield an upper
bound on all matroids, using theorem 6 to bound $i(J(n, r)) = s_{n,r}$. We think that it may be possible
to directly prove theorem 3 along these lines, and conjecture that the number of
circuit-hyperplane-free matroids is relatively small.



Conjecture 22.

\[ \lim_{n \to \infty} \frac{|\{M \in \mathbb{M}_n \mid M \text{ has no circuit-hyperplanes}\}|}{m_n} = 0. \]



This would also follow from conjecture 5, as the only sparse paving matroids without
circuit-hyperplanes are the uniform matroids.

Mayhew, Newman and Whittle have recently shown that real-representability of matroids cannot
be captured by a 'natural' axiom, that is, a sentence of monadic second-order logic for matroids,
which is defined in their paper [17]. We believe that the difficulty may be in the seemingly
unmanageable set of independent sets of the Johnson graph, and conjecture that there is not even a
natural axiom that captures whether a sparse paving matroid is real-representable. On the other
hand, we wonder whether this difficulty can be factored out, and ask if there is a natural axiom
that describes which circuit-hyperplane-free matroids are real-representable.

7.4. The cover complexity of a matroid. For a matroid $M$, we define the cover complexity as

\[ \kappa(M) := \min\{|Z| \mid Z \subseteq \mathcal{F}(M),\ Z \text{ is a flat cover of } M\}. \]

The following lemma is straightforward.

Lemma 23. Let $M$ be a matroid. Then

(1) $\kappa(M) = \kappa(M^*)$;

(2) $\kappa(M) \leq \kappa(M/e) + \kappa(M \setminus e)$, for any $e \in E(M)$;

(3) if $N$ is a minor of $M$, then $\kappa(M) \geq \kappa(N)$;

(4) if $N$ arises from $M$ by relaxing a circuit-hyperplane, then $\kappa(M) = \kappa(N) + 1$.

In [16, Conj. 1.7] it is conjectured that if $N$ is any sparse paving matroid, then

\[ \lim_{n \to \infty} \frac{|\{M \in \mathbb{M}_n \mid M \text{ does not have an } N\text{-minor}\}|}{m_n} = 0. \]

In a forthcoming paper, we will show that the conjecture holds for $N = U_{2,k}$ and $N = U_{3,6}$, by
deriving bounds on the cover complexity of matroids not having such a minor $N$. We pose the
challenge of bounding

\[ \max\{\kappa(M) \mid M \in \mathbb{M}_n,\ M \text{ does not have an } M(K_4)\text{-minor}\}. \]